A statistical framework for power calculations in ChIP-seq experiments

Authors

  • Chandler Zuo
  • Sündüz Keles
Abstract

MOTIVATION: ChIP-seq technology enables investigators to study genome-wide binding of transcription factors and to map epigenomic marks. Although basic analysis tools for ChIP-seq data are rapidly becoming available, there has been little progress on the related design issues. A challenging question in designing a ChIP-seq experiment is how deeply the ChIP and control samples should be sequenced. The answer depends on multiple factors, some of which the experimenter can set based on pilot or preliminary data. The sequencing depth of a ChIP-seq experiment is one of the key factors that determine whether all the underlying targets (e.g. binding locations or epigenomic profiles) can be identified with a targeted power.

RESULTS: We developed a statistical framework named CSSP (ChIP-seq Statistical Power) for power calculations in ChIP-seq experiments by considering a local Poisson model, which is commonly adopted by many peak callers. Evaluations with simulations and data-driven computational experiments demonstrate that this framework can reliably estimate the power of a ChIP-seq experiment at different sequencing depths based on pilot data. Furthermore, it provides an analytical approach for calculating the depth required for a targeted power while controlling the false discovery rate at a user-specified level. Hence, our results enable researchers to use their own or publicly available data to determine the required sequencing depths of their ChIP-seq experiments and potentially make better use of the multiplexing functionality of the sequencers. Evaluation of power for multiple public ChIP-seq datasets indicates that, currently, typical ChIP-seq studies are well powered for detecting large fold changes of ChIP enrichment over the control sample, but have considerably less power for detecting smaller fold changes.

AVAILABILITY: Available at www.stat.wisc.edu/~zuo/CSSP.

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
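The relationship between sequencing depth and power described in the abstract can be illustrated with a deliberately simplified sketch. This is not the CSSP implementation: it assumes a single genomic bin whose read count is Poisson, a background rate that scales linearly with depth, a fixed per-bin significance level in place of the false discovery rate control the abstract describes, and hypothetical function and parameter names (`bin_power`, `required_depth_scale`, `background_rate`, `depth_scale`) chosen only for this example.

```python
import math


def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu) by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def critical_count(mu0, alpha):
    """Smallest count c with P(X >= c) <= alpha under the background mean mu0."""
    c = 0
    while poisson_sf(c, mu0) > alpha:
        c += 1
    return c


def bin_power(background_rate, fold_change, depth_scale, alpha=1e-3):
    """Power to call one enriched bin at a given multiple of the pilot depth.

    Assumption: the expected background count is background_rate * depth_scale,
    and an enriched bin has fold_change times that mean.
    """
    mu0 = background_rate * depth_scale
    mu1 = mu0 * fold_change
    return poisson_sf(critical_count(mu0, alpha), mu1)


def required_depth_scale(background_rate, fold_change, target_power,
                         alpha=1e-3, step=0.25, max_scale=100.0):
    """First depth multiple on a grid whose power reaches target_power.

    Returns None if the target is not reached by max_scale. Because counts are
    discrete, power is not strictly monotone in depth, so this is a grid
    search rather than a closed-form solution.
    """
    scale = step
    while scale <= max_scale:
        if bin_power(background_rate, fold_change, scale, alpha) >= target_power:
            return scale
        scale += step
    return None


if __name__ == "__main__":
    # Hypothetical pilot: 2 expected background reads per bin at pilot depth.
    for fold in (1.5, 2.0, 3.0):
        print(fold, bin_power(2.0, fold, 1.0), required_depth_scale(2.0, fold, 0.8))
```

Under these assumptions, smaller fold changes need a larger depth multiple to reach the same power, which matches the qualitative conclusion of the abstract. A real analysis would estimate the background and enrichment parameters from pilot data rather than fixing them by hand.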

Similar resources

A Statistical Framework for the Analysis of ChIP-Seq Data.

Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing is decreasing, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its incr...

ALDEx2: ANOVA-Like Differential Expression tool for compositional data

Fundamentally, many high throughput sequencing approaches generate similar data: reads are mapped to features in each sample, these features are normalized, then statistical difference between the features composing each group or condition is calculated. The standard statistical tools used to analyze RNA-seq, ChIP-seq, 16S rRNA gene sequencing, metagenomics, etc. are fundamentally different for...

Spatio-temporal model for multiple ChIP-seq experiments.

The increasing availability of ChIP-seq data demands advanced statistical tools to analyze the results of such experiments. The inherent features of high-throughput sequencing output call for a modelling framework that can account for the spatial dependency between neighboring regions of the genome and the temporal dimension that arises from observing the protein binding process at progress...

Genome analysis A novel statistical method for quantitative comparison of multiple ChIP-seq datasets

Motivation: ChIP-seq is a powerful technology to measure the protein binding or histone modification strength in the whole genome scale. Although there are a number of methods available for single ChIP-seq data analysis (e.g. ‘peak detection’), rigorous statistical method for quantitative comparison of multiple ChIP-seq datasets with the considerations of data from control experiment, signal to...

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets

MOTIVATION ChIP-seq is a powerful technology to measure the protein binding or histone modification strength in the whole genome scale. Although there are a number of methods available for single ChIP-seq data analysis (e.g. 'peak detection'), rigorous statistical method for quantitative comparison of multiple ChIP-seq datasets with the considerations of data from control experiment, signal to ...

Journal title:
  • Bioinformatics

Volume 30, Issue 6

Pages: -

Publication date: 2014